Abstract. - A model for growing information networks is introduced where nodes receive 
new links through j-redirection, i.e. the probability for a node to receive a link depends on 
the number of paths of length j arriving at this node. In detail, when a new node enters the 
network, it either connects to a randomly selected node, or to the j-ancestor of this selected 
node. The j-ancestor is found by following j links from the randomly selected node. The 
system is shown to undergo a transition to a phase where condensates develop. We also find 
analytical predictions for the height statistics and show numerically the non-trivial behaviour 
of the degree distribution. 



Motivation. - It is well-known that large networked information systems (e.g. citation 
networks or the Web) are explored by following the links between items [1]. This process is 
at the heart of common search engines like Google, and is based on the empirical observation 
that an individual surfing the Web will typically follow of the order of 6 hyperlinks before 
switching to an unrelated site [2]. Practically, search engines mimic this behaviour by sending 
"random walkers" who, part of the time, follow links between websites, and otherwise jump to 
a randomly selected website in the network. The average number of walkers at a given node is 
the measure of the importance of the node in the network (e.g. the Google Rank number). In 
view of this search mechanism, one expects that nodes with a higher density of walkers are 
visited more often, and should therefore receive more links from newly introduced nodes. This 
feedback mechanism increases the degree of the selected node, in a manner naively reminiscent 
of preferential attachment [3], as well as its density of walkers, thereby increasing the 
probability that the selected node is chosen again in the future. 

In its most basic form, a growing network with redirection is defined as follows: a node 
enters the system, first connects to a target node (chosen randomly in the whole network) and 
then, with some probability p, redirects its link to the ancestor of the target node. This model 
is well-known [4] to lead to linear preferential attachment in the network, and to reproduce 
the formation of fat-tailed degree distributions $k^{-\nu}$, with $\nu = 1 + 1/p$. However, more 
realistic situations where the entering node follows j > 1 links before connecting to a node 
(see Fig. 1) have not been considered yet. From now on, we call this recursive exploration of 
the network j-redirection. In the following, we will mainly focus on the 2-redirection case and 
restrict the scope to networks where nodes have only one outgoing link. We will show how this 
slight generalization leads to much more complicated situations than in the case j = 1, such as 
the formation of condensates in the network. 

Fig. 1 - Sketch of a time step of the model with 1-redirection or 2-redirection. The system is first 
composed of 7 nodes. A new node, labelled with 8, enters the network and randomly selects the 
marked node (node 4). If 1-redirection takes place, the entering node connects to the father of the 
marked node with probability p. If 2-redirection takes place, the entering node connects to the 
grandfather of the marked node with probability p. Otherwise, the entering node connects to the 
marked node. 
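
The growth rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the results of this Letter: the names `grow_network`, `parent` and `height` are hypothetical, and the seed is assumed to act as its own ancestor, so that a walk reaching the seed in fewer than j steps stops there.

```python
import random


def grow_network(t_max, p, j, seed=0):
    """Grow a tree by j-redirection; return (parent, height) lists.

    Assumption: the seed (node 0) is its own ancestor, so a walk that
    reaches the seed before completing j steps simply stays on it.
    """
    rng = random.Random(seed)
    parent = [0]   # parent[i] is the node that node i links to
    height = [0]   # height[i] is the number of links from node i to the seed
    for _ in range(t_max):
        target = rng.randrange(len(parent))   # uniformly selected node
        if rng.random() < p:                  # redirect with probability p
            for _ in range(j):                # follow j outgoing links
                target = parent[target]
        parent.append(target)
        height.append(height[target] + 1)
    return parent, height


if __name__ == "__main__":
    parent, height = grow_network(t_max=10_000, p=0.4, j=2)
    print("fraction of nodes at height 1:", height.count(1) / len(height))
```

Storing only the single outgoing link of each node is sufficient here because the scope is restricted to networks where nodes have one outgoing link.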

Basic model. - Let us first study the simplest version of the model where entering nodes 
explore the network with 1-redirection. Initially (t = 0), the network is composed of one 
node, the seed, and at each time step t, a new node enters the network. Consequently, the total 
number of nodes is equal to N = 1 + t, and the number of links is L = t. We will focus on 
the height distribution, the height of a node [5] being defined as the minimum number of 
links to the seed. The probability that a node at depth g in the directed network receives 
the link is: 



$$P_g \sim (1-p) N_g + p N_{g+1}, \qquad (1)$$

except for the seed g = 0: 

$$P_0 \sim N_0 + p N_1, \qquad (2)$$

where $N_g$ is the average number of nodes at depth g. The normalisation follows: 

$$N_0 + p N_1 + \sum_{i=1}^{\infty} \left[(1-p) N_i + p N_{i+1}\right] = N. \qquad (3)$$

Putting the above pieces together, it is straightforward to show that the rate equation for $N_g$ 
reads in the continuous time limit: 

$$\partial_t N_{1;t} = \frac{1}{N}(N_0 + p N_1),$$
$$\partial_t N_{g;t} = \frac{1}{N}\left[(1-p) N_{g-1} + p N_g\right]. \qquad (4)$$

R. Lambiotte and M. Ausloos: Growing network with j-redirection 

As a first level of description, we derive an equation for the average total height 
$G = \sum_{g=0}^{\infty} g N_g$ from Eq. (4) that reads in the long time limit $t \gg 1$: 

$$\partial_t G = (1-p) + \frac{G}{t}. \qquad (5)$$

This equation leads to the asymptotic behaviour $G \sim (1-p)\, t \ln t$, from which one recovers 
the behaviour $G \sim t \ln t$ taking place when the model is purely random (no redirection). 
Consequently, the redirecting process slows down the growth of the total height of the network. 
This is expected, as redirection favours the connection to nodes closer to the seed. In the 
limiting case p = 1, where the process is easily shown to lead to a star network (i.e. all the 
nodes are connected to the seed), one finds $G \sim t$. 
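
The rate equations (4) can also be integrated numerically to follow the total height G. The sketch below is a minimal illustration under stated assumptions: it uses unit Euler time steps (one step per entering node) and truncates the height profile at a hypothetical cut-off `g_max`; the function names are not taken from the original work.

```python
import math


def integrate_heights(t_max, p, g_max=80):
    """Integrate Eq. (4) with unit time steps; return [N_0, ..., N_gmax]."""
    N = [0.0] * (g_max + 1)
    N[0] = 1.0                        # only the seed is present at t = 0
    for t in range(t_max):
        total = 1.0 + t               # N = 1 + t nodes before the new arrival
        new = N[:]
        new[1] += (N[0] + p * N[1]) / total
        for g in range(2, g_max + 1):
            new[g] += ((1 - p) * N[g - 1] + p * N[g]) / total
        N = new
    return N


def total_height(N):
    """G = sum over g of g * N_g."""
    return sum(g * n for g, n in enumerate(N))


if __name__ == "__main__":
    t_max, p = 5000, 0.3
    G = total_height(integrate_heights(t_max, p))
    print("G / (t ln t) =", G / (t_max * math.log(t_max)), "to compare with 1 - p =", 1 - p)
```

The ratio printed by the usage example is expected to approach 1 - p only slowly, since the general solution of Eq. (5) also contains a term linear in t, whose relative weight decays like 1/ln t.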

Condensation in the 2-redirection model. - Let us now focus on the more challenging 
case when the network is explored with 2-redirection (see Fig. 2). The generalization to any 
value of j > 1 is straightforward and will be briefly discussed at the end of this section. The 
probability that a node at the depth g in the directed network receives the link is: 

$$P_g \sim (1-p) N_g + p N_{g+2}, \qquad (6)$$

except for the seed, where: 

$$P_0 \sim N_0 + p N_1 + p N_2, \qquad (7)$$

and where the normalization follows: 

$$N_0 + p N_1 + p N_2 + \sum_{i=1}^{\infty} \left[(1-p) N_i + p N_{i+2}\right] = N. \qquad (8)$$

The rate equation for $N_g$ and the equation for the average G are respectively: 

$$\partial_t N_{1;t} = \frac{1}{N}(N_0 + p N_1 + p N_2),$$
$$\partial_t N_{g;t} = \frac{1}{N}\left[(1-p) N_{g-1} + p N_{g+1}\right], \qquad (9)$$

and 

$$\partial_t G = \left[p\, n_1 + (1-2p) + \frac{G}{t}\right], \qquad (10)$$

where $n_g = N_g/N$ is the proportion of nodes at height g. There are obviously two possible 
cases. i) If $n_1$ is vanishingly small in the long time limit, Eq. (10) simplifies into 

$$\partial_t G = (1-2p) + \frac{G}{t}, \qquad (11)$$

whose solution is $G \sim (1-2p)\, t \ln t$. This solution suggests that a qualitative change occurs 
around $p_c = 1/2$. ii) If $n_1$ does not vanish in the long time limit, this term has to be taken 
into account. Let us stress that a finite value of $n_1$ implies the formation of a condensate in 
the network, i.e. the seed attracts a non-vanishing fraction of the links in the network [6-9]. 

Let us evaluate the values of p for which such a condensate exists and the corresponding 
value of $n_1$. To do so, one needs to find stationary solutions to the equations for $n_g$: 

$$(1+t)\, \partial_t n_1 = (n_0 + p n_1 + p n_2) - n_1,$$
$$(1+t)\, \partial_t n_g = \left[(1-p) n_{g-1} + p n_{g+1}\right] - n_g. \qquad (12)$$

The stationary solutions are found by recurrence and by using the fact that $n_0$ is negligible 
in the long time limit. Indeed, $N_0$ is (and remains) equal to 1 by construction, so that 
$N_0/N = 1/N \to 0$. It is straightforward to show that the stationary solution is in general: 



$$n_g = \frac{1}{C} \left(\frac{1-p}{p}\right)^{g-1}, \qquad (13)$$

whose normalisation constant is $C = \sum_{g=1}^{\infty} \left(\frac{1-p}{p}\right)^{g-1}$. Consequently, the system 
reaches a stationary solution when p > 1/2, i.e. when $(1-p)/p < 1$, so that $C = \frac{p}{2p-1}$. 
Otherwise, the probability normalisation is not satisfied. 
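
As a quick consistency check, the stationary profile of Eq. (13) can be evaluated numerically and inserted back into the stationary form of Eq. (12). The sketch below is illustrative only; the function name `stationary_profile` and the cut-off `g_max` are assumptions introduced for this example.

```python
def stationary_profile(p, g_max=200):
    """Return [n_1, ..., n_gmax] from Eq. (13), with C = p / (2p - 1)."""
    if not 0.5 < p < 1.0:
        raise ValueError("the stationary solution requires 1/2 < p < 1")
    ratio = (1 - p) / p          # geometric ratio, smaller than 1 for p > 1/2
    C = p / (2 * p - 1)          # normalisation constant
    return [ratio ** (g - 1) / C for g in range(1, g_max + 1)]


if __name__ == "__main__":
    p = 0.75
    n = stationary_profile(p)
    print("sum of n_g     :", sum(n))
    print("n_1            :", n[0], "vs (2p - 1)/p =", (2 * p - 1) / p)
    # residual of the stationary equation for g = 2 (list index 1)
    print("residual at g=2:", (1 - p) * n[0] + p * n[2] - n[1])
```

The geometric form makes the role of p = 1/2 explicit: the ratio (1 - p)/p reaches 1 there, and the series defining C no longer converges.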

By inserting the above solution $n_1 = \frac{2p-1}{p}$ into Eq. (10), one arrives at the trivial 
evolution equation 

$$\partial_t G = \frac{G}{t}, \qquad (14)$$

so that the average height G/t asymptotically goes to a constant, in agreement with the 
observed formation of condensates. Before focusing on the regime p < 1/2, let us stress 
that the existence of non-vanishing stationary values of $n_g$ is not possible in the 1-redirection 
model. In contrast, the formation of condensates takes place for any other j-redirection, j > 1. 
This result is straightforward after generalizing Eq. (10) into: 

$$\partial_t G = \left[p \sum_{g=1}^{j-1} (j-g)\, n_g + (1-jp) + \frac{G}{t}\right], \qquad (15)$$

from which one finds that the transition occurs at $p_c = 1/j$. 

Fig. 3 - In the left figure, relation between the index of the first negative amplitude $F(\beta)$ and the 
possible eigenvalue $\beta$. The results are obtained by numerically integrating Eq. (17) up to $g_{max} = 5000$. 
The method shows that a whole region of $\beta$ exists where all $A_g$ are positive. Let us stress that $F(\beta)$ 
is limited to the maximum value $g_{max}$ by construction. In the right figure, observed value of $\beta(p)$ 
obtained by integrating numerically the dynamical equations Eq. (16). At $\tau = 200$, the derivatives 
$D_i = \partial(\ln N_i)/\partial\tau$ are measured and are shown to be independent of i. Results are compared with 
those obtained with the first negative amplitude approach. The small discrepancies are due to 
non-stationary effects, i.e. at $\tau = 200$, the system has not yet reached its asymptotic state. The dotted 
line is the theoretical prediction Eq. (21). 



p < 1/2. - It is useful to introduce the time scale $d\tau = dt/(1+t)$ ($\tau \sim \log t$), in which 
the set of equations to solve reads: 

$$\partial_\tau N_1 = (N_0 + p N_1 + p N_2),$$
$$\partial_\tau N_g = \left[(1-p) N_{g-1} + p N_{g+1}\right]. \qquad (16)$$

This is a linear and homogeneous set of equations, so that one expects the solutions to have 
a time dependence $e^{\beta\tau} \sim t^{\beta}$, where $\beta$ is an eigenvalue of the dynamics. In the case p > 1/2, 
we have shown above that $\beta = 1$ is a proper eigenvalue and found the eigenvector Eq. (13). In 
the following, we look for the solution $\beta(p)$ that is reached when p < 1/2. Solving the whole 
spectrum of eigenvalues of the matrix dynamics is out of the question. Instead, we introduce the 
ansatz $N_i = A_i t^{\beta}$ and look for the solutions $A_i$: 

$$\beta A_1 = (p A_1 + p A_2),$$
$$\beta A_g = \left[(1-p) A_{g-1} + p A_{g+1}\right], \qquad (17)$$

which can be solved by recurrence: 

$$A_2 = \frac{\beta - p}{p} A_1, \qquad A_{g+1} = \frac{\beta A_g - (1-p) A_{g-1}}{p}. \qquad (18)$$

A priori, any value of $\beta \in\, ]0,1[$ is available, except those for which any of the amplitudes 
$A_i$ becomes negative. In order to evaluate the values of $\beta$ that respect this condition, we have 
integrated numerically the above recurrence relations and looked, at a fixed value of p, for the 
relation $F(\beta)$, where F is the index of the first amplitude $A_F$ that becomes negative, so that 
no amplitude $A_g$ with g < F is negative. By construction, $F(\beta)$ should go to infinity for allowed 
values of $\beta$. Numerical integrations (Fig. 3, left) show that a whole region $\beta < \beta_c$ is excluded 
due to this non-negativity constraint. In contrast, any value $\beta > \beta_c$ keeps all $A_i$ positive and is 
a priori eligible. However, numerical integration of Eq. (17) suggests that only 
the value $\beta_c$ is selected by the dynamics (Fig. 3, right). In the limiting case $p \to 1/2$, the value 
$\beta = 1$ is recovered. 
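
The first-negative-amplitude criterion can be sketched as follows. This is a minimal illustration rather than the original numerical code: the function name `first_negative_index` is hypothetical, the amplitude scale is fixed by assuming $A_1 = 1$, and the amplitudes are rescaled at each step to avoid floating-point overflow, which leaves their signs unchanged because the recurrence is linear.

```python
def first_negative_index(beta, p, g_max=5000):
    """Return F(beta): the index of the first negative amplitude A_g.

    If no amplitude up to A_gmax is negative, g_max is returned, so that
    F is limited to g_max by construction.
    """
    a_prev = 1.0                   # A_1, arbitrary positive scale (assumption)
    a_curr = (beta - p) / p        # A_2, from the first line of Eq. (17)
    if a_curr < 0:
        return 2
    for g in range(2, g_max):
        # A_{g+1} from beta * A_g = (1 - p) * A_{g-1} + p * A_{g+1}
        a_next = (beta * a_curr - (1 - p) * a_prev) / p
        if a_next < 0:
            return g + 1
        scale = max(a_curr, a_next) or 1.0   # both are non-negative here
        a_prev, a_curr = a_curr / scale, a_next / scale
    return g_max


if __name__ == "__main__":
    p = 0.4
    for beta in (0.3, 0.9, 0.95, 0.99, 1.0):
        print(beta, first_negative_index(beta, p))
```

Scanning $\beta$ at fixed p with such a function gives a curve $F(\beta)$ that can be compared with the left panel of Fig. 3.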

Let us try to evaluate analytically the location of the transition. To do so, we focus on 
the relation 

$$\beta A_g = \left[(1-p) A_{g-1} + p A_{g+1}\right] \qquad (19)$$

for large values of g, assume that A(g) is continuous and keep only the leading terms 
$A_{g\pm 1} = A_g \pm A'_g + \frac{1}{2} A''_g$. In this case, the recurrence relation recasts into the following 
homogeneous differential equation: 

$$A''_g - 2(1-2p) A'_g + 2(1-\beta) A_g = 0. \qquad (20)$$

It is straightforward to show that the solutions of this equation undergo a transition at: 

$$\beta_c = \frac{1}{2} + 2p(1-p). \qquad (21)$$

Above this value, the amplitude $A_g$ is positive definite and asymptotically behaves like an 
exponential $A_g \sim e^{(1-2p+\sqrt{\Delta})g}$, where $\Delta = -1 + 2\beta - 4p + 4p^2$. Below this value, in contrast, the 
solution exhibits an oscillatory behaviour $A_g \sim e^{(1-2p)g} e^{i\gamma g}$, with $\gamma = \sqrt{1 - 2\beta + 4p - 4p^2}$. 
Consequently, this solution takes negative values, i.e. these values of $\beta$ are forbidden. 
Comparison of this theoretical prediction with the numerical results (Fig. 3, right) shows an 
excellent agreement, at least for small values of 1/2 - p. Let us stress, though, that deviations 
from this continuous approximation take place for large values of 1/2 - p. Indeed, $\beta_c$ goes to 
1/2 in the limit $p \to 0$, while one expects (and measures by integrating Eq. (17)) that $\beta_c$ should 
go to zero. 
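
For completeness, the steps of the continuous approximation can be written out explicitly. The following is a worked sketch of the computation outlined above, writing $A$ for $A(g)$ and primes for derivatives with respect to g, and using a trial solution $A \propto e^{rg}$, where r is a symbol introduced here for illustration.

```latex
% Insert A_{g \pm 1} \simeq A \pm A' + A''/2 into
% \beta A_g = (1-p) A_{g-1} + p A_{g+1}:
\begin{align*}
\beta A &= (1-p)\left(A - A' + \tfrac{1}{2}A''\right)
         + p\left(A + A' + \tfrac{1}{2}A''\right) \\
        &= A - (1-2p)\,A' + \tfrac{1}{2}A'' \\
\Rightarrow\quad 0 &= A'' - 2(1-2p)\,A' + 2(1-\beta)\,A .
\end{align*}
% Trial solution A \propto e^{r g}:
\begin{align*}
r^2 - 2(1-2p)\,r + 2(1-\beta) &= 0, \\
r &= (1-2p) \pm \sqrt{\Delta}, \qquad
\Delta = (1-2p)^2 - 2(1-\beta) = -1 + 2\beta - 4p + 4p^2 .
\end{align*}
% The roots are real (no oscillation, hence no sign change) when \Delta \ge 0:
\[
\beta \;\ge\; \tfrac{1}{2} + 2p(1-p).
\]
```

The threshold reduces to $\beta = 1$ at p = 1/2 and to $\beta = 1/2$ at p = 0, which are the two limits discussed in the text.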

Degree distribution. - As soon as j-redirections with j > 1 are introduced, deriving a closed 
equation for the degree distribution becomes complicated. This 
is due to the fact that a 2-variable distribution for the degrees of the nodes at the extremities 
of one link [4] has to be added in order to account for the 2-redirections. Similarly, once 
one tries to write an equation for that distribution, the distribution involving three degrees 
characterizing two adjacent links has to be considered, etc., leading to an infinite hierarchy. 
A mean field description through a truncation of the hierarchy at some level, even though 
possible in principle, has not been carried out yet and remains an open problem. In the following, 
we restrict the scope to a numerical analysis of the degree distribution. To do so, we perform 
50 computer realizations of the random process, measure the degree distribution after long 
times $t > 10^6$ and average over the realizations. 
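
A numerical measurement of this kind can be sketched as follows. The sketch is illustrative only: the function name `in_degree_distribution` is hypothetical, the degree is taken to be the number of incoming links, the seed is assumed to be its own ancestor, and the default sizes are far smaller than the 50 realizations and $t > 10^6$ quoted above, so that the example runs quickly in pure Python.

```python
import random
from collections import Counter


def in_degree_distribution(t_max, p, j=2, realizations=5, seed=0):
    """Average the in-degree distribution over several grown networks.

    Returns a dict {k: fraction of nodes with k incoming links}.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(realizations):
        parent = [0]              # the seed is its own ancestor (assumption)
        indeg = [0]
        for _ in range(t_max):
            target = rng.randrange(len(parent))
            if rng.random() < p:
                for _ in range(j):
                    target = parent[target]
            parent.append(target)
            indeg.append(0)       # the entering node has no incoming link yet
            indeg[target] += 1
        counts.update(indeg)      # accumulate the histogram of in-degrees
    n_nodes = realizations * (t_max + 1)
    return {k: c / n_nodes for k, c in sorted(counts.items())}


if __name__ == "__main__":
    dist = in_degree_distribution(t_max=20_000, p=0.4)
    for k in list(dist)[:10]:
        print(k, dist[k])
```

For a meaningful estimate of the tail, the histogram would in practice be binned logarithmically and the network sizes increased well beyond these defaults.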

Computer simulations show (Fig. 4) that the distribution reaches a stationary distribution 
except for a peak in its tail that advances in time. One observes that the position of this peak 
grows like $t^{\beta_c}$, with $\beta_c(p)$ defined above. This result is expected as the average seed degree is 
$N_1$ and this quantity grows like $t^{\beta_c}$. Moreover, the stationary part of the distribution 
converges toward a power law $k^{-\nu}$ for large values of k. We have verified the stationarity of 
this asymptotic state by measuring the degree distributions at different times t. 

Fig. 4 - In the left figure, degree distribution measured from simulations with p = 0.4 at 2 different 
times $t = 2 \times 10^6$ and $t = 2 \times 10^7$. In the right figure, power-law exponent $\nu$ of the tail of the 
distribution $k^{-\nu}$. The empirical result is compared with the theoretical result for the 1-redirection 
model, 1 + 1/p. 

Conclusion. - In this Letter, we have focused on a simple model of growing directed 
networks, where the probability for a node to receive a link depends on the number of paths 
of length j arriving at this node. This process, which we call j-redirection, generalizes a 
redirection process known to lead to preferential attachment [4] and mimics the way people 
explore the Web. We have shown that when j ≥ 2, the system undergoes a transition to a 
regime where condensates develop around the seed node. Condensates are nodes that receive a 
non-vanishing fraction of the links when the number of nodes N goes to infinity. Let us stress 
that such states have been observed in other types of models [6-9], and that such 
winner-takes-all phenomena are associated with extreme configurations of the network, where a 
monopoly-like configuration develops. We have also focused on the degree distribution arising in 
such systems. Computer simulations show that the degree distribution asymptotically reaches an 
almost stationary state, where only the degree of the seed makes the solution non-stationary. 
The stationary part is shown to converge to a power-law distribution. It is remarkable that 
the exponents belong to the interval [2,3] for most of the values of the redirecting 
probability p. Let us stress that this effect is reminiscent of the properties of another model 
with redirection [10]. The mechanism that we propose could therefore give an explanation for the 
proliferation of exponents in that interval [11,12] in many empirical studies, e.g. collaboration 
networks [13] and the Web [14]. To conclude, we would like to insist on the generality and 
simplicity of our approach, which is shown to exhibit a complex phenomenology. As a next 
step, analytical predictions for the degree distributions, based on mean field assumptions, 
should be considered in order to improve our knowledge of the model. 